
Neural Information Processing Systems

We sincerely thank all reviewers for their feedback. We will clarify the issues raised below and incorporate them into our final version. R2: we will describe the reactive baseline and define N + M (L144), and clarify what "sequence of episodes" refers to. We will extend this with more examples in the main paper. Modeling users' understanding of an AI's mind [A] could provide an explanation component to teach users about the agent. We would like to disentangle two orthogonal aspects of communication, i.e., modeling of other agents and language. Our model with emergent language is an interesting extension.


9 Supplement Overview


This document contains supplementary material for "Dialog without Dialog Data: Learning Visual Dialog Agents from VQA Data". The main paper excludes some details, which we provide here. Section 12 reports the ablations we use to evaluate the effects of different aspects of the proposed Q-bot. This section describes our architecture in more detail; there is a minor notation difference between this section and the main paper. Note that for the planner there is an additional residual connection at line 16, which augments the hidden state.
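The residual connection around the planner update can be sketched as follows. This is a minimal illustration, not the paper's implementation: `update_fn` stands in for the planner's recurrent cell, and the function name is our own.

```python
def planner_step(update_fn, x, h):
    """Apply one planner update with a residual connection.

    update_fn: stand-in for the planner's recurrent cell (an assumption);
               maps (input, hidden) -> update vector of the same length as h.
    x: input vector for this step.
    h: previous hidden state.
    """
    delta = update_fn(x, h)
    # Residual connection: the previous hidden state is added back in,
    # so the cell only has to learn an update on top of it.
    return [hi + di for hi, di in zip(h, delta)]
```

With a toy cell that simply returns its input, `planner_step(lambda x, h: x, [1.0, 2.0], [0.5, 0.5])` yields `[1.5, 2.5]`, showing the old state carried through the update.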




A Model, training, and dataset details

All models are trained end-to-end with the Gumbel-Softmax straight-through estimator.
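For reference, the relaxed sampling step underlying Gumbel-Softmax training can be sketched in plain Python. This is a generic sketch of the technique, not the paper's code; the straight-through variant additionally discretizes the sample in the forward pass while backpropagating through this soft distribution.

```python
import math
import random

def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Draw one soft sample from a categorical distribution over `logits`.

    Adds Gumbel(0, 1) noise to each logit, then applies a
    temperature-scaled softmax. As temperature -> 0 the sample
    approaches a one-hot vector.
    """
    # Gumbel(0, 1) noise via inverse transform: -log(-log(U)),
    # with small constants guarding against log(0).
    gumbels = [-math.log(-math.log(rng.random() + 1e-20) + 1e-20)
               for _ in logits]
    scores = [(l + g) / temperature for l, g in zip(logits, gumbels)]
    # Numerically stable softmax.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]
```

Because the softmax is differentiable in the logits, this reparameterized sample lets gradients flow through discrete token choices during end-to-end training.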


Models are trained on a single Titan Xp GPU on an internal cluster. Training typically takes 6-8 hours with 4 CPUs and 32GB of RAM. We train with batch size B = 128. As in ShapeWorld, RNN encoders and decoders are single-layer GRUs with hidden size 1024 and embedding size 500. For additional example games from both datasets, see Figure S1.
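The hyperparameters above can be collected into a small config object. The class and field names here are illustrative, not taken from the paper's codebase; only the values come from the text.

```python
from dataclasses import dataclass

@dataclass
class TrainingConfig:
    """Hyperparameters stated in the section above (names are our own)."""
    batch_size: int = 128    # B = 128
    hidden_size: int = 1024  # GRU hidden size
    embed_size: int = 500    # token embedding size
    num_layers: int = 1      # single-layer GRU encoders and decoders
```

Keeping these in one dataclass makes it easy to log the exact configuration alongside each run.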